AITopics | rectifier network

We study the complexity of functions computable by deep feedforward neural networks with piecewise linear activations in terms of the symmetries and the number of linear regions that they have. Deep networks are able to sequentially map portions of each layer's input-space to the same output. In this way, deep models compute functions that react equally to complicated patterns of different inputs. The compositional structure of these functions enables them to re-use pieces of computation exponentially often in terms of the network's depth. This paper investigates the complexity of such compositional maps and contributes new theoretical results regarding the advantage of depth for neural networks with piecewise linear activation functions. In particular, our analysis is not specific to a single family of models, and as an example, we employ it for rectifier and maxout networks. We improve complexity bounds from pre-existing work and investigate the behavior of units in higher layers.

artificial intelligence, linear region, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.04)
North America > Canada > Ontario > Toronto (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

On the Number of Linear Regions of Deep Neural Networks

Neural Information Processing SystemsMar-13-2024, 06:15:30 GMT

We study the complexity of functions computable by deep feedforward neural networks with piecewise linear activations in terms of the symmetries and the number of linear regions that they have. Deep networks are able to sequentially map portions of each layer's input-space to the same output. In this way, deep models compute functions that react equally to complicated patterns of different inputs. The compositional structure of these functions enables them to re-use pieces of computation exponentially often in terms of the network's depth. This paper investigates the complexity of such compositional maps and contributes new theoretical results regarding the advantage of depth for neural networks with piecewise linear activation functions. In particular, our analysis is not specific to a single family of models, and as an example, we employ it for rectifier and maxout networks. We improve complexity bounds from pre-existing work and investigate the behavior of units in higher layers.

activation, linear region, neural network, (16 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.04)
North America > Canada > Ontario > Toronto (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Nearly-tight bounds on linear regions of piecewise linear neural networks

Hu, Qiang, Zhang, Hao

arXiv.org Artificial IntelligenceNov-1-2018

The developments of deep neural networks (DNN) in recent years have ushered a brand new era of artificial intelligence. DNNs are proved to be excellent in solving very complex problems, e.g., visual recognition and text understanding, to the extent of competing with or even surpassing people. Despite inspiring and encouraging success of DNNs, thorough theoretical analyses still lack to unravel the mystery of their magics. The design of DNN structure is dominated by empirical results in terms of network depth, number of neurons and activations. A few of remarkable works published recently in an attempt to interpret DNNs have established the first glimpses of their internal mechanisms. Nevertheless, research on exploring how DNNs operate is still at the initial stage with plenty of room for refinement. In this paper, we extend precedent research on neural networks with piecewise linear activations (PLNN) concerning linear regions bounds. We present (i) the exact maximal number of linear regions for single layer PLNNs; (ii) a upper bound for multi-layer PLNNs; and (iii) a tighter upper bound for the maximal number of liner regions on rectifier networks. The derived bounds also indirectly explain why deep models are more powerful than shallow counterparts, and how non-linearity of activation functions impacts on expressiveness of networks.

artificial intelligence, linear region, machine learning, (19 more...)

arXiv.org Artificial Intelligence

1810.13192

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

From Deep to Shallow: Transformations of Deep Rectifier Networks

An, Senjian, Boussaid, Farid, Bennamoun, Mohammed, Hu, Jiankun

arXiv.org Machine LearningMar-30-2017

In this paper, we introduce transformations of deep rectifier networks, enabling the conversion of deep rectifier networks into shallow rectifier networks. We subsequently prove that any rectifier net of any depth can be represented by a maximum of a number of functions that can be realized by a shallow network with a single hidden layer. The transformations of both deep rectifier nets and deep residual nets are conducted to demonstrate the advantages of the residual nets over the conventional neural nets and the advantages of the deep neural nets over the shallow neural nets. In summary, for two rectifier nets with different depths but with same total number of hidden units, the corresponding single hidden layer representation of the deeper net is much more complex than the corresponding single hidden representation of the shallower net. Similarly, for a residual net and a conventional rectifier net with the same structure except for the skip connections in the residual net, the corresponding single hidden layer representation of the residual net is much more complex than the corresponding single hidden layer representation of the conventional net.

artificial intelligence, machine learning, rectifier network, (14 more...)

arXiv.org Machine Learning

1703.10355

Country: Oceania > Australia (0.28)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

The Shattered Gradients Problem: If resnets are the answer, then what is the question?

Balduzzi, David, Frean, Marcus, Leary, Lennox, Lewis, JP, Ma, Kurt Wan-Duo, McWilliams, Brian

arXiv.org Machine LearningFeb-27-2017

A long-standing obstacle to progress in deep learning is the problem of vanishing and exploding gradients. The problem has largely been overcome through the introduction of carefully constructed initializations and batch normalization. Nevertheless, architectures incorporating skip-connections such as resnets perform much better than standard feedforward architectures despite well-chosen initialization and batch normalization. In this paper, we identify the shattered gradients problem. Specifically, we show that the correlation between gradients in standard feedforward networks decays exponentially with depth resulting in gradients that resemble white noise. In contrast, the gradients in architectures with skip-connections are far more resistant to shattering decaying sublinearly. Detailed empirical evidence is presented in support of the analysis, on both fully-connected networks and convnets. Finally, we present a new "looks linear" (LL) initialization that prevents shattering. Preliminary experiments show the new initialization allows to train very deep networks without the addition of skip-connections.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Machine Learning

1702.08591

Country: Europe > Switzerland (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Add feedback

On the Number of Linear Regions of Deep Neural Networks

Montufar, Guido F., Pascanu, Razvan, Cho, Kyunghyun, Bengio, Yoshua

Neural Information Processing SystemsDec-31-2014

We study the complexity of functions computable by deep feedforward neural networks with piecewise linear activations in terms of the symmetries and the number of linear regions that they have. Deep networks are able to sequentially map portions of each layer's input-space to the same output. In this way, deep models compute functions that react equally to complicated patterns of different inputs. The compositional structure of these functions enables them to re-use pieces of computation exponentially often in terms of the network's depth. This paper investigates the complexity of such compositional maps and contributes new theoretical results regarding the advantage of depth for neural networks with piecewise linear activation functions. In particular, our analysis is not specific to a single family of models, and as an example, we employ it for rectifier and maxout networks. We improve complexity bounds from pre-existing work and investigate the behavior of units in higher layers.

artificial intelligence, linear region, machine learning, (19 more...)

Neural Information Processing Systems

Country: North America > Canada (0.14)

Genre: Research Report (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

On the Number of Linear Regions of Deep Neural Networks

Montúfar, Guido, Pascanu, Razvan, Cho, Kyunghyun, Bengio, Yoshua

arXiv.org Machine LearningJun-7-2014

We study the complexity of functions computable by deep feedforward neural networks with piecewise linear activations in terms of the symmetries and the number of linear regions that they have. Deep networks are able to sequentially map portions of each layer's input-space to the same output. In this way, deep models compute functions that react equally to complicated patterns of different inputs. The compositional structure of these functions enables them to re-use pieces of computation exponentially often in terms of the network's depth. This paper investigates the complexity of such compositional maps and contributes new theoretical results regarding the advantage of depth for neural networks with piecewise linear activation functions. In particular, our analysis is not specific to a single family of models, and as an example, we employ it for rectifier and maxout networks. We improve complexity bounds from pre-existing work and investigate the behavior of units in higher layers.

artificial intelligence, linear region, machine learning, (20 more...)

arXiv.org Machine Learning

1402.1869

Country: North America > Canada (0.14)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback